Rank in Wordlist | Frequency | Word |
---|---|---|
998 | 1 | 1,5 |
1025 | 1 | 1989,umubare |
1028 | 1 | 2,5 |
1041 | 1 | 3,7 |
1063 | 1 | 46,048 |
1072 | 1 | 8,1 |
1078 | 1 | 98,3 |
1258 | 1 | ISAR),icyo |
1340 | 1 | Jean,akaba |
1349 | 1 | Karama,ariko |

Rank in Wordlist | Frequency | Word |
---|---|---|
3142 | 1 | n’ubuke)(izina |

Rank in Wordlist | Frequency | Word |
---|---|---|
1258 | 1 | ISAR),icyo |
1568 | 1 | abacoloni)kugeza |
3142 | 1 | n’ubuke)(izina |

Rank in Wordlist | Frequency | Word |
---|---|---|
714 | 2 | d'érable |
861 | 2 | nk'uko |
1294 | 1 | Ikoranabuhanga mu Itumanaho n'Isakazabumenyi |
1736 | 1 | b'abakirisitu |
2037 | 1 | bw'ubuntu |
2090 | 1 | by'ubuntu |
2091 | 1 | by'umwihariko |
2158 | 1 | cy'imibereho |
2803 | 1 | kw'ingagi |
2804 | 1 | kw'umutsi |

Rank in Wordlist | Frequency | Word |
---|---|---|
603 | 2 | UN/ONU |
1076 | 1 | 9/10 |
2372 | 1 | http://rwanda |
2908 | 1 | movement/mouvement |
3187 | 1 | philosophy/Philosophie |
3192 | 1 | power/droit |

In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others will not. If words containing forbidden characters do not have a very low frequency, there might be a problem in preprocessing.
Words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
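
The same check can be repeated for every character in the list above. The following is a minimal sketch, not the tooling actually used for this report: it assumes a `words` table with columns `w_id`, `word` and `freq`, in which the first 100 ids are reserved (as the `w_id-100` in the query suggests), and it runs against an in-memory SQLite database rather than the original server. The helper names `words_containing` and `build_demo_db` are hypothetical, and the demo rows are a few entries taken from the tables in this section plus one made-up plain word. Because `%` and `_` are wildcards in `LIKE`, the sketch uses SQLite's `instr()` so that no escaping is needed.

```python
import sqlite3

# Special characters listed in the text. '%' and '_' are LIKE wildcards,
# so the query below uses instr() instead of LIKE to avoid escaping.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def words_containing(conn, char, limit=10):
    """Return (rank, freq, word) rows for words that contain `char`."""
    query = (
        "SELECT w_id - 100 AS word_rank, freq, word FROM words "
        "WHERE w_id > 100 AND instr(word, ?) > 0 "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(query, (char, limit)).fetchall()


def build_demo_db():
    """Create a small in-memory table (assumed schema, illustrative rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)"
    )
    rows = [
        (703, "UN/ONU", 2),       # rank 603 in the tables above
        (961, "nk'uko", 2),       # rank 861
        (1098, "1,5", 1),         # rank 998
        (1358, "ISAR),icyo", 1),  # rank 1258
        (2000, "ubuntu", 5),      # made-up plain word, no special character
    ]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for ch in SPECIAL_CHARS:
        hits = words_containing(conn, ch)
        if hits:
            print(f"Words containing {ch!r}:")
            for rank, freq, word in hits:
                print(f"  {rank} | {freq} | {word}")
```

Characters that return no rows are skipped, so only the characters that actually occur inside words are printed. Whether a hit indicates a preprocessing problem still depends on the language and on the frequency of the word.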
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots